Discovering Statistically Significant Co-location Rules in Datasets with Extended Spatial Objects

Authors

  • Jundong Li
  • Osmar R. Zaïane
  • Alvaro Osornio-Vargas
Abstract

Co-location rule mining is a spatial data mining task that focuses on detecting sets of spatial features that show spatial associations. Most previous methods are based on transaction-free Apriori-like algorithms, which depend on user-defined thresholds and are designed for boolean data points. Because there is no clear notion of transactions, it is nontrivial to apply association rule mining techniques to the co-location rule mining problem. To address these difficulties, a transactionization approach designed to mine datasets with extended spatial objects was recently proposed. It uses a statistical test instead of global thresholds to detect significant co-location rules. One major shortcoming of that work is that it limits the antecedent of a co-location rule to at most three features, so the algorithm is difficult to scale up. In this paper we introduce a new algorithm that fully exploits the property of statistical significance to detect more general co-location rules. We apply our algorithm to real datasets involving the National Pollutant Release Inventory (NPRI). A classifier is also proposed to help evaluate the discovered co-location rules.
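
The abstract does not spell out the transactionization step or the statistical test, so the following Python sketch is only an illustration under stated assumptions, not the authors' algorithm. It assumes each extended object is a circle given as (feature, x, y, radius), that transactions are formed at the points of a regular grid, and that significance is estimated with a simple Monte Carlo randomization test in which objects are re-placed uniformly at random. The function names transactionize, support and p_value are hypothetical.

```python
import random
from itertools import product


def transactionize(objects, grid_step, width, height):
    """Turn circular objects into transactions, one per covered grid point.

    objects: list of (feature, x, y, radius) tuples (an assumed representation).
    Returns a list of frozensets of feature names.
    """
    nx = int(width / grid_step) + 1
    ny = int(height / grid_step) + 1
    transactions = []
    for i, j in product(range(nx), range(ny)):
        px, py = i * grid_step, j * grid_step
        # A feature belongs to the transaction if its circle covers the point.
        items = frozenset(
            f for f, x, y, r in objects
            if (px - x) ** 2 + (py - y) ** 2 <= r * r
        )
        if items:
            transactions.append(items)
    return transactions


def support(transactions, itemset):
    """Count the transactions that contain every feature in itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)


def p_value(objects, itemset, grid_step, width, height, trials=200, seed=0):
    """Estimate how often random placement yields support >= the observed one.

    Assumed null model: each object keeps its feature and radius but is
    placed uniformly at random in the study area.
    """
    rng = random.Random(seed)
    observed = support(
        transactionize(objects, grid_step, width, height), itemset
    )
    hits = 0
    for _ in range(trials):
        shuffled = [
            (f, rng.uniform(0, width), rng.uniform(0, height), r)
            for f, _x, _y, r in objects
        ]
        simulated = support(
            transactionize(shuffled, grid_step, width, height), itemset
        )
        if simulated >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (trials + 1)
```

A feature set would then be reported only when its estimated p-value falls below a chosen significance level, which replaces a global support threshold with a per-pattern test.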

Similar resources

Discovering Co-location Patterns in Datasets with Extended Spatial Objects

Co-location mining is one of the tasks of spatial data mining, which focuses on the detection of the sets of spatial features frequently located in close proximity of each other. Previous work is based on transaction-free apriori-like algorithms. The approach we propose is based on a grid transactionization of geographic space and designed to mine datasets with extended spatial objects. A stati...


On Discovering Co-Location Patterns in Datasets: A Case Study of Pollutants and Child Cancers

We intend to identify relationships between cancer cases and pollutant emissions by proposing a novel co-location mining algorithm. In this context, we specifically attempt to understand whether there is a relationship between the location of a child diagnosed with cancer and any chemical combinations emitted from various facilities in that particular location. Colocation pattern mining intend...


A multiple window-based co-location pattern mining approach for various types of spatial data

Studies on spatial co-location mining have required a distance threshold to define the spatial neighbourhood (Shashi Shekhar and Yan Huang (2001); Yoo and Shekhar (2004, 2006); Yasuhiko Morimoto (2001); Koperski and Han (1995); Ding et al. (2008)). However, it is problematic for users to choose suitable threshold values because they lack prior knowledge about spatial data. Spatial neighbourhood has been def...


New Methods for Mining Sequential and Time Series Data

Data mining is the process of extracting knowledge from large amounts of data. It covers a variety of techniques aimed at discovering diverse types of patterns on the basis of the requirements of the domain. These techniques include association rules mining, classification, cluster analysis and outlier detection. The availability of applications that produce massive amounts of spatial, spatio-t...


Publication date: 2014